WIP: OCPNODE-2462 - Split Filesystem OCP Enhancement #1657

Closed
kannon92 wants to merge 3 commits from the split-filesystem branch

@kannon92 kannon92 commented Jul 31, 2024

This is an OCP enhancement for enabling KEP-4191 in OpenShift.

@openshift-ci openshift-ci bot requested review from jerpeter1 and runcom July 31, 2024 21:19
Contributor

openshift-ci bot commented Jul 31, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign ashcrow for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kannon92 kannon92 force-pushed the split-filesystem branch 2 times, most recently from eb11659 to 957070f July 31, 2024 21:36
enhancements/kubelet/split-filesystem.md
- https://github.com/containers/storage/pull/1885

- How does one delete all images and containers once the container runtime config is changed?
- crictl on all images and containers on each node?
Contributor

This could be an out-of-scope item. There is an argument that externally managed storage is owned by the customer.

Author

My main reason for listing this is the case where you change from no image store to a new image store location. Kubelet GC would not clean up the existing images in the old location.

And tbh crictl wouldn't either. I guess we would require manual deletion if they ran out of disk space.

@rphillips
Contributor

rphillips commented Jul 31, 2024

Technically, I think we should reserve IMAGE_STORE as a partition label.
This document could go into this. The script can attempt to mount the IMAGE_STORE partition by label.

The split-filesystem.sh script would:

  • mount the partition labeled IMAGE_STORE by label
  • check whether the relabelling has been done
    • if not, relabel the filesystem
      • touch a file within the mounted filesystem on completion of the relabelling (e.g. .relabel-complete)

Node Team's scope starts at the IMAGE_STORE partition label. This partition label could be set up by CAPI, by documentation, or by some other external entity.
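
The flow described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual split-filesystem.sh: the function name `setup_image_store`, the injected `run` and `exists` helpers, and the mount point are hypothetical, and "relabel" is assumed here to mean an SELinux relabel via `restorecon -R`, which the comment does not spell out.

```python
from pathlib import PurePosixPath

IMAGE_STORE_LABEL = "IMAGE_STORE"   # partition label proposed in the comment
MARKER_NAME = ".relabel-complete"   # marker file name suggested in the comment


def setup_image_store(mount_point, run, exists):
    """Mount the labeled partition and relabel it at most once.

    run(cmd) executes a command given as a list of strings and
    exists(path) reports whether a path exists. Both are injected
    (hypothetical helpers) so the flow can be exercised without root.
    Returns True if a relabel was performed, False if it was skipped.
    """
    marker = str(PurePosixPath(mount_point) / MARKER_NAME)

    # Step 1: mount by label; mount(8) accepts LABEL=<label> as the source.
    run(["mount", f"LABEL={IMAGE_STORE_LABEL}", mount_point])

    # Step 2: the marker lives inside the mounted filesystem,
    # so it can only be checked after the mount.
    if exists(marker):
        return False

    # Step 3 (assumption): treat "relabel" as an SELinux relabel.
    run(["restorecon", "-R", mount_point])

    # Step 4: record completion so later runs skip the relabel.
    run(["touch", marker])
    return True
```

On a node, `run` might be `lambda cmd: subprocess.run(cmd, check=True)` and `exists` might be `os.path.exists`. The sketch does not handle a partition that is already mounted or a label that is missing; a real script would need to decide what to do in both cases.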


```storage.bu
variant: openshift
version: 4.14.0
```
Member

Does this work with 4.17.0? If so, we should use the newest version IMO.

Author

Interestingly enough I only got this to work with 4.14.


Since the image cache has changed locations, all the old images left over should be removed.

The simplest option is to remove the images on each node where this feature was enabled.
Member

This should spell out what to do with the containers (which would be left over after the reboot), especially because a lot of those containers will require a running network pod to remove.

Author

I sort of have this as an open question.

Do we manually delete all of /var/lib/containers/storage when this feature is enabled or disabled?

I think using a tool like crictl may not actually work if we change the container runtime config. For podman, changing these locations does orphan the existing containers and images.

I think a manual rm may be the best thing we can do, unless we delete before the feature is enabled.
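
Before deciding on manual removal, it may help to know how much space the old storage directory still holds. A minimal sketch, assuming the directory is readable by the caller; the function name `dir_size_bytes` is hypothetical and the result is an approximation of apparent file sizes, not of blocks actually allocated on disk.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of regular files and symlinks under root.

    Hard-linked files are counted once by tracking (device, inode)
    pairs, and symlinks are not followed.
    """
    seen = set()
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            total += st.st_size
    return total
```

Calling it with the old storage directory as the argument gives a rough figure to weigh against the free space on the node.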

Contributor

Why do we want the old images removed?

@rphillips
Contributor

We are going to take this enhancement to a meeting to discuss further design.

@kannon92 kannon92 changed the title from "OCPNODE-2462 - Split Filesystem OCP Enhancement" to "WIP: OCPNODE-2462 - Split Filesystem OCP Enhancement" Aug 6, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Aug 6, 2024
@openshift-bot

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale label (denotes an issue or PR that has remained open with no activity and has become stale) Sep 3, 2024
@kwilczynski
Member

/remove-lifecycle stale

@openshift-ci openshift-ci bot removed the lifecycle/stale label (denotes an issue or PR that has remained open with no activity and has become stale) Sep 3, 2024
Contributor

openshift-ci bot commented Sep 20, 2024

@kannon92: all tests passed!


@openshift-ci openshift-ci bot added the lifecycle/stale label (denotes an issue or PR that has remained open with no activity and has become stale) Oct 19, 2024
@openshift-bot

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/rotten label (denotes an issue or PR that has aged beyond stale and will be auto-closed) and removed the lifecycle/stale label (denotes an issue or PR that has remained open with no activity and has become stale) Oct 26, 2024
@openshift-bot

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

Contributor

openshift-ci bot commented Nov 3, 2024

@openshift-bot: Closed this PR.

@openshift-ci openshift-ci bot closed this Nov 3, 2024